Abstraction and approximate decision-theoretic planning
Identifieur interne : 000118 ( Main/Exploration ); précédent : 000117; suivant : 000119Abstraction and approximate decision-theoretic planning
Auteurs : Richard Dearden [Canada] ; Craig Boutilier [Canada]Source :
- Artificial Intelligence [ 0004-3702 ] ; 1997.
Descripteurs français
- Wicri :
- topic : Perquisition.
English descriptors
- KwdEn :
Abstract
Markov decision processes (MDPs) have recently been proposed as useful conceptual models for understanding decision-theoretic planning. However, the utility of the associated computational methods remains open to question: most algorithms for computing optimal policies require explicit enumeration of the state space of the planning problem. We propose an abstraction technique for MDPs that allows approximately optimal solutions to be computed quickly. Abstractions are generated automatically, using an intensional representation of the planning problem (probabilistic strips rules) to determine the most relevant problem features and optimally solving a reduced problem based on these relevant features. The key features of our method are: abstractions can be generated quickly; the abstract solution can be applied directly to the original problem; and the loss of optimality can be bounded. We also describe methods by which the abstract solution can be viewed as a set of default reactions that can be improved incrementally, and used as a heuristic for search-based planning or other MDP methods. Finally, we discuss certain difficulties that point toward other forms of aggregation for MDPs.
Url:
DOI: 10.1016/S0004-3702(96)00023-9
Affiliations:
Links toward previous steps (curation, corpus...)
Le document en format XML
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Abstraction and approximate decision-theoretic planning</title>
<author><name sortKey="Dearden, Richard" sort="Dearden, Richard" uniqKey="Dearden R" first="Richard" last="Dearden">Richard Dearden</name>
</author>
<author><name sortKey="Boutilier, Craig" sort="Boutilier, Craig" uniqKey="Boutilier C" first="Craig" last="Boutilier">Craig Boutilier</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:A82685ED61F587C143A9F32E055C63BDAEF7DE44</idno>
<date when="1997" year="1997">1997</date>
<idno type="doi">10.1016/S0004-3702(96)00023-9</idno>
<idno type="url">https://api.istex.fr/document/A82685ED61F587C143A9F32E055C63BDAEF7DE44/fulltext/pdf</idno>
<idno type="wicri:Area/Main/Corpus">000356</idno>
<idno type="wicri:Area/Main/Curation">000321</idno>
<idno type="wicri:Area/Main/Exploration">000118</idno>
<idno type="wicri:explorRef" wicri:stream="Main" wicri:step="Exploration">000118</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Abstraction and approximate decision-theoretic planning</title>
<author><name sortKey="Dearden, Richard" sort="Dearden, Richard" uniqKey="Dearden R" first="Richard" last="Dearden">Richard Dearden</name>
<affiliation wicri:level="4"><country>Canada</country>
<wicri:regionArea>Department of Computer Science, University of British Columbia, Vancouver, BC</wicri:regionArea>
<orgName type="university">Université de la Colombie-Britannique</orgName>
<placeName><settlement type="city">Vancouver</settlement>
<region type="state">Colombie-Britannique </region>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Canada</country>
</affiliation>
</author>
<author><name sortKey="Boutilier, Craig" sort="Boutilier, Craig" uniqKey="Boutilier C" first="Craig" last="Boutilier">Craig Boutilier</name>
<affiliation wicri:level="4"><country>Canada</country>
<wicri:regionArea>Department of Computer Science, University of British Columbia, Vancouver, BC</wicri:regionArea>
<orgName type="university">Université de la Colombie-Britannique</orgName>
<placeName><settlement type="city">Vancouver</settlement>
<region type="state">Colombie-Britannique </region>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Canada</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">Artificial Intelligence</title>
<title level="j" type="abbrev">ARTINT</title>
<idno type="ISSN">0004-3702</idno>
<imprint><publisher>ELSEVIER</publisher>
<date type="published" when="1997">1997</date>
<biblScope unit="volume">89</biblScope>
<biblScope unit="issue">1–2</biblScope>
<biblScope unit="page" from="219">219</biblScope>
<biblScope unit="page" to="283">283</biblScope>
</imprint>
<idno type="ISSN">0004-3702</idno>
</series>
<idno type="istex">A82685ED61F587C143A9F32E055C63BDAEF7DE44</idno>
<idno type="DOI">10.1016/S0004-3702(96)00023-9</idno>
<idno type="PII">S0004-3702(96)00023-9</idno>
<idno type="ArticleID">96000239</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0004-3702</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Abstraction</term>
<term>Approximation</term>
<term>Decision theory</term>
<term>Execution</term>
<term>Heuristics</term>
<term>Markov decision processes</term>
<term>Planning</term>
<term>Search</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Perquisition</term>
</keywords>
</textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Markov decision processes (MDPs) have recently been proposed as useful conceptual models for understanding decision-theoretic planning. However, the utility of the associated computational methods remains open to question: most algorithms for computing optimal policies require explicit enumeration of the state space of the planning problem. We propose an abstraction technique for MDPs that allows approximately optimal solutions to be computed quickly. Abstractions are generated automatically, using an intensional representation of the planning problem (probabilistic strips rules) to determine the most relevant problem features and optimally solving a reduced problem based on these relevant features. The key features of our method are: abstractions can be generated quickly; the abstract solution can be applied directly to the original problem; and the loss of optimality can be bounded. We also describe methods by which the abstract solution can be viewed as a set of default reactions that can be improved incrementally, and used as a heuristic for search-based planning or other MDP methods. Finally, we discuss certain difficulties that point toward other forms of aggregation for MDPs.</div>
</front>
</TEI>
<affiliations><list><country><li>Canada</li>
</country>
<region><li>Colombie-Britannique </li>
</region>
<settlement><li>Vancouver</li>
</settlement>
<orgName><li>Université de la Colombie-Britannique</li>
</orgName>
</list>
<tree><country name="Canada"><region name="Colombie-Britannique "><name sortKey="Dearden, Richard" sort="Dearden, Richard" uniqKey="Dearden R" first="Richard" last="Dearden">Richard Dearden</name>
</region>
<name sortKey="Boutilier, Craig" sort="Boutilier, Craig" uniqKey="Boutilier C" first="Craig" last="Boutilier">Craig Boutilier</name>
<name sortKey="Boutilier, Craig" sort="Boutilier, Craig" uniqKey="Boutilier C" first="Craig" last="Boutilier">Craig Boutilier</name>
<name sortKey="Dearden, Richard" sort="Dearden, Richard" uniqKey="Dearden R" first="Richard" last="Dearden">Richard Dearden</name>
</country>
</tree>
</affiliations>
</record>
Pour manipuler ce document sous Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/TlfNancyV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000118 | SxmlIndent | more
Ou
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000118 | SxmlIndent | more
Pour mettre un lien sur cette page dans le réseau Wicri
{{Explor lien |wiki= Wicri/Lorraine |area= TlfNancyV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:A82685ED61F587C143A9F32E055C63BDAEF7DE44 |texte= Abstraction and approximate decision-theoretic planning }}
This area was generated with Dilib version V0.6.39. |